Vectorized OpenCL implementation of numerical integration for higher order finite elements
Authors
Abstract
In this work we analyze the computational aspects of numerical integration in finite element calculations and consider an OpenCL implementation of the related algorithms for processors with wide vector registers. As a test platform we choose the PowerXCell processor, an example of the Cell Broadband Engine (CellBE) architecture. Although the processor is considered old by today’s standards (its design dates back to 2001), we investigate its performance because of two features that it shares with the recent Xeon Phi family of coprocessors: wide vector units and a relatively slow connection between the computing cores and main global memory. The analysis of parallelization options can also be used to design numerical integration algorithms for other processors with vector registers, such as contemporary x86 microprocessors. We consider higher order finite element approximations and implement the standard numerical integration algorithm for prismatic elements. The original contributions of the paper include the analysis of data movement and vector operations performed during code execution. Several versions of the implementation are developed and tested in practice.
Similar resources
Numerical resolution of conservation laws with OpenCL
We present several numerical simulations of conservation laws on recent multicore processors, such as GPUs, using the OpenCL programming framework. Depending on the chosen numerical method, different implementation strategies have to be considered to achieve the best performance. We explain how to program efficiently three methods: a finite volume approach on a structured grid, a high orde...
Absolute_Performance.eps
The paper investigates the implementation and performance of the finite element numerical integration algorithm for first order approximations on three processor architectures popular in scientific computing: classical CPU, Intel Xeon Phi and NVIDIA Kepler GPU. A unifying programming model and a portable OpenCL implementation are considered for all architectures. Variations of the ...
A novel modification of decouple scaled boundary finite element method in fracture mechanics problems
In fracture mechanics and failure analysis, the energy of cracked media and, consequently, the stress intensity factors (SIFs) play a crucial role. Based on linear elastic fracture mechanics (LEFM), the SIFs and energy of cracked media may be estimated. This study presents a novel modification of the decoupled scaled boundary finite element method (DSBFEM) to model cracked media. In this metho...
Numerical integration on GPUs for higher order finite elements
The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose integration procedures, as well as particular example applications. The most important characteristic of the problem investigated is the large variation of r...
Bacon: A GPU Programming Language With Just in Time Specialization (Draft)
This paper describes Bacon, a data-parallel programming system targeting OpenCL-compatible graphics processors. This system is built upon the existing OpenCL standard in order to make it easier for programmers to write high performance kernels for GPU accelerated applications. The OpenCL C syntax is extended into a new language, Bacon C, intended to make development significantly more convenien...
Journal title:
- Computers & Mathematics with Applications
Volume 66, Issue
Pages -
Publication date 2013